Dimensionality reduction of the enhanced feature set for the HMM-based speech recognizer
Author
Abstract
In the past few years, a great deal of research has been directed toward finding acoustic features that are effective for automatic speech recognition. Until recently, most speech recognizers used about 12 cepstral coefficients derived through linear prediction analysis as recognition features [1]. In [2,3], Furui investigated the use of temporal derivatives of cepstral coefficients and energy as recognition features in a dynamic time warping-based isolated word recognizer and showed how the recognition performance improves with the inclusion of first derivatives in the feature set. These results were later confirmed in a number of studies for more general tasks (such as speaker-independent connected digit recognition and large-vocabulary continuous speech recognition) using hidden Markov model (HMM)-based speech recognizers [4-6]. More recently, some studies advocating the use of second (and higher)-order temporal derivatives of cepstral coefficients for speech recognition have been reported [7-9]. These temporal derivatives have also been found useful as recognition features for speaker recognition [10-12]. As a result, most present-day speech recognizers use a larger feature set to enhance speech recognition performance [13-X]. This feature set usually consists of cepstral coefficients and energy, and their derivatives. Though the addition of new features has improved speech recognition performance, it has also created some problems. For example, a recognizer using a larger (or enhanced) feature set is computation...
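The temporal derivatives ("delta" features) discussed above are commonly computed with a regression over a short window of frames. The sketch below is a generic illustration of that idea, not the exact parameterization used in the cited papers; the function name, window length, and edge padding are illustrative choices.

```python
import numpy as np

def delta_features(cepstra, window=2):
    """Regression-based temporal derivatives of cepstral frames.

    cepstra: (T, D) array, one row per frame.
    Frames are edge-padded so the output keeps shape (T, D).
    Illustrative sketch; window length is an assumption.
    """
    T, D = cepstra.shape
    padded = np.pad(cepstra, ((window, window), (0, 0)), mode="edge")
    # Normalizer of the regression formula: 2 * sum(tau^2)
    denom = 2 * sum(tau * tau for tau in range(1, window + 1))
    deltas = np.zeros_like(cepstra, dtype=float)
    for tau in range(1, window + 1):
        # tau-weighted forward/backward frame differences
        deltas += tau * (padded[window + tau : window + tau + T]
                         - padded[window - tau : window - tau + T])
    return deltas / denom

# Enhanced feature set: static cepstra + deltas + delta-deltas
c = np.random.randn(100, 12)
enhanced = np.hstack([c, delta_features(c),
                      delta_features(delta_features(c))])
# enhanced.shape == (100, 36): three times the static dimensionality,
# which is exactly the growth the abstract identifies as a problem.
```

Stacking the derivatives triples the feature dimensionality, which is why dimensionality reduction of this enhanced set becomes attractive.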
Similar Resources
Feature Transformation Based on Generalization of Linear Discriminant Analysis
Hidden Markov models (HMMs) have been widely used to model speech signals for speech recognition. However, they cannot precisely model the time dependency of feature parameters. In order to overcome this limitation, several researchers have proposed extensions, such as segmental unit input HMM (Nakagawa & Yamamoto, 1996). Segmental unit input HMM has been widely used for its effectiveness and t...
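The abstract above concerns a generalization of linear discriminant analysis (LDA) for feature transformation. As background, a minimal sketch of plain Fisher LDA as a dimensionality-reducing feature transform is given below; it is not the paper's generalized method, and the function name and interface are illustrative.

```python
import numpy as np

def lda_transform(X, y, n_components):
    """Project features onto Fisher-LDA discriminant directions.

    X: (N, D) feature matrix, y: (N,) class labels.
    Returns the (N, n_components) transformed features.
    Plain LDA sketch, not the paper's generalization.
    """
    classes = np.unique(y)
    mean = X.mean(axis=0)
    d = X.shape[1]
    Sw = np.zeros((d, d))  # within-class scatter
    Sb = np.zeros((d, d))  # between-class scatter
    for c in classes:
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        diff = (mc - mean).reshape(-1, 1)
        Sb += len(Xc) * (diff @ diff.T)
    # Solve the generalized eigenproblem Sb v = lambda Sw v
    eigvals, eigvecs = np.linalg.eig(np.linalg.solve(Sw, Sb))
    order = np.argsort(eigvals.real)[::-1]
    W = eigvecs[:, order[:n_components]].real
    return X @ W
```

The discriminant directions maximize between-class scatter relative to within-class scatter, so the projected features keep class separability while reducing dimensionality.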
An effective feature compensation scheme tightly matched with speech recognizer employing SVM-based GMM generation
This paper proposes an effective feature compensation scheme to address a real-life situation where a clean speech database is not available for Gaussian Mixture Model (GMM) training for a model-based feature compensation method. The proposed scheme employs a Support Vector Machine (SVM)-based model selection method to effectively generate the GMM for our feature compensation method directly from ...
Speech enhancement based on hidden Markov model using sparse code shrinkage
This paper presents a new hidden Markov model-based (HMM-based) speech enhancement framework based on independent component analysis (ICA). We propose analytical procedures for training clean speech and noise models by the Baum re-estimation algorithm and present a maximum a posteriori (MAP) estimator based on a Laplace-Gaussian combination (for clean speech and noise, respectively) in the HMM ...
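For a Laplacian-distributed coefficient observed in additive Gaussian noise, the MAP estimate has a well-known closed form: soft thresholding, which is the shrinkage rule behind sparse code shrinkage. The sketch below illustrates that rule; the function name and parameterization are illustrative, not taken from the paper.

```python
import numpy as np

def laplace_map_shrink(y, noise_var, laplace_scale):
    """MAP estimate of x from y = x + n, with x ~ Laplace(0, b)
    and n ~ Gaussian(0, noise_var).

    Minimizing (y - x)^2 / (2 * noise_var) + |x| / b gives soft
    thresholding with threshold noise_var / b. Illustrative sketch.
    """
    thresh = noise_var / laplace_scale
    return np.sign(y) * np.maximum(np.abs(y) - thresh, 0.0)
```

Small observations are set exactly to zero while large ones are shrunk toward zero by the threshold, which is what suppresses noise in sparse transform domains.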
Experiments on a parametric nonlinear spectral warping for an HMM-based speech recognizer
This paper is concerned with the search for an optimal feature-set for a speech recognition system. A better acoustic feature analysis that suitably enhances the semantic information in a consistent fashion can reduce raw-score (no grammar) error rate significantly. A simple two-dimensional parameterized feature set is proposed. The feature-set is compared against a standard mel-cepstrum, LPC-b...
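The paper's specific two-parameter warping is not given in the snippet above, so as a stand-in illustration of a parametric nonlinear frequency warping, the sketch below implements the classic one-parameter bilinear (all-pass) warping; the function name and parameter value are illustrative.

```python
import numpy as np

def bilinear_warp(omega, alpha):
    """Bilinear (all-pass) frequency warping of omega in [0, pi].

    alpha in (-1, 1) controls the warping strength; alpha = 0 is the
    identity, positive alpha stretches low frequencies (mel-like).
    Illustrative stand-in for the paper's two-parameter warping.
    """
    return omega + 2.0 * np.arctan(
        alpha * np.sin(omega) / (1.0 - alpha * np.cos(omega))
    )
```

The map is monotone and fixes the endpoints 0 and pi, so it reallocates spectral resolution without changing the band limits.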
Feature dimension reduction using reduced-rank maximum likelihood estimation for hidden Markov models
This paper presents a new method of feature dimension reduction in hidden Markov modeling (HMM) for speech recognition. The key idea is to apply reduced-rank maximum likelihood estimation in the M-step of the usual Baum-Welch algorithm for estimating HMM parameters, such that the estimates of the Gaussian distribution parameters are restricted to a subspace of reduced dimensionality. There are ...
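The core idea above, constraining Gaussian parameter estimates to a low-dimensional subspace, can be illustrated with a much simpler stand-in: projecting unconstrained mean estimates onto a shared low-rank subspace found by SVD. This is not the paper's reduced-rank M-step, only a sketch of the rank constraint itself.

```python
import numpy as np

def reduced_rank_means(means, rank):
    """Constrain per-state mean vectors to a shared low-rank subspace.

    means: (S, D) matrix of unconstrained mean estimates.
    Returns the (S, D) means projected onto the top-`rank` principal
    directions of their spread. Illustrative sketch only.
    """
    mu = means.mean(axis=0)
    centered = means - mu
    # Top-`rank` right singular vectors span the retained subspace
    _, _, Vt = np.linalg.svd(centered, full_matrices=False)
    basis = Vt[:rank]
    return mu + (centered @ basis.T) @ basis
```

Means that already lie in a rank-`rank` subspace pass through unchanged; otherwise the projection is the closest (least-squares) rank-constrained approximation, mirroring the bias/variance trade-off that motivates reduced-rank estimation.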
Journal: Digital Signal Processing
Volume 2
Pages: -
Publication date: 1992